Exploiting Data-Parallelism on Multicore and SMT Systems for Implementing the Fractal Image Compressing Problem

Authors

  • Rodrigo da Rosa Righi
  • Vinicius Facco Rodrigues
  • Cristiano André da Costa
  • Roberto de Quadros Gomes
Abstract

This paper presents a parallel model of a lossy image compression method based on fractal theory and evaluates it on two dual-core processors: one with and one without simultaneous multithreading (SMT) support. The goal is to observe the speedup on both configurations when varying application parameters and the number of threads at the operating system level. The target application is particularly relevant in the Big Data era: huge amounts of data often need to be sent over low- or medium-bandwidth networks, or saved on devices with limited storage capacity, which motivates efficient image compression. Fractal compression in particular is a CPU-bound coding method known for achieving high file-reduction ratios at the cost of highly time-consuming computation. The structure of the problem allowed us to exploit data parallelism by implementing an embarrassingly parallel version of the algorithm. Despite its simplicity, our model is useful for fully exploiting and evaluating the considered architectures. When comparing performance on both processors, the results showed that the SMT-based one achieved gains of up to 29%. They also showed that a larger number of threads does not always reduce application time. On average, the results showed a curve in which a strong time reduction is achieved with 4 threads on the plain dual-core processor and with 8 threads on the SMT dual-core processor. Beyond these points, the execution time grows slowly as the number of threads increases, owing to both task granularity and thread management overhead.
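The data-parallel structure mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`blocks`, `downsample`, `fit`, `best_match`, `compress`), the block size, and the choice of Python are all assumptions made for illustration. Each range block searches the pool of downsampled domain blocks for the best least-squares match on its own, so the range blocks can be distributed across worker threads with no communication between them. The sketch leaves out isometries and any bound on the contrast factor, both of which a practical coder would normally include.

```python
from concurrent.futures import ThreadPoolExecutor

def blocks(img, size):
    """Yield (row, col, flat pixel list) for each non-overlapping size x size block."""
    for r in range(0, len(img) - size + 1, size):
        for c in range(0, len(img[0]) - size + 1, size):
            yield r, c, [img[r + i][c + j] for i in range(size) for j in range(size)]

def downsample(img, r, c, size):
    """Shrink a (2*size) x (2*size) domain block to size x size by 2x2 averaging."""
    return [(img[r + 2 * i][c + 2 * j] + img[r + 2 * i + 1][c + 2 * j]
             + img[r + 2 * i][c + 2 * j + 1] + img[r + 2 * i + 1][c + 2 * j + 1]) / 4.0
            for i in range(size) for j in range(size)]

def fit(dom, rng):
    """Least-squares contrast s and brightness o with s*dom + o ~ rng; returns (err, s, o)."""
    n = len(rng)
    md, mr = sum(dom) / n, sum(rng) / n
    var = sum((d - md) ** 2 for d in dom)
    s = 0.0 if var == 0 else sum((d - md) * (x - mr) for d, x in zip(dom, rng)) / var
    o = mr - s * md
    err = sum((s * d + o - x) ** 2 for d, x in zip(dom, rng))
    return err, s, o

def best_match(range_block, domains):
    """Independent task: find the domain block that best approximates one range block."""
    r, c, rng = range_block
    err, s, o, dr, dc = min(fit(dom, rng) + (dr, dc) for dr, dc, dom in domains)
    return (r, c, dr, dc, s, o, err)

def compress(img, size=2, workers=4):
    """Encode every range block; the tasks share only the read-only domain pool."""
    h, w = len(img), len(img[0])
    domains = [(r, c, downsample(img, r, c, size))
               for r in range(0, h - 2 * size + 1, size)
               for c in range(0, w - 2 * size + 1, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so the output does not depend on `workers`.
        return list(pool.map(lambda rb: best_match(rb, domains), blocks(img, size)))
```

Calling `compress(img, size=2, workers=8)` on a list-of-lists grayscale image returns one tuple per range block. In CPython the global interpreter lock keeps pure-Python threads from running this CPU-bound loop in parallel, so the sketch shows the task decomposition and not the speedup itself; a process pool or a compiled language would be needed to reproduce measurements like those the abstract reports.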

Similar resources

A multigrain Delaunay mesh generation method for multicore SMT-based architectures

Given the proliferation of layered, multicore and SMT-based architectures, it is imperative to deploy and evaluate important, multi-level, scientific computing codes, such as meshing algorithms, on these systems. We focus on Parallel Constrained Delaunay Mesh (PCDM) generation. We exploit coarse-grain parallelism at the subdomain level, medium-grain at the cavity level and fine-grain at the elem...

Efficient parallel solutions to the integral knapsack problem on current chip-multiprocessor systems

The emergence of chip multiprocessor systems has dramatically increased the performance potential of computer systems. However, harnessing the full potential of these systems depends largely on the effectiveness of system software such as compilers, in exploiting the on-chip parallelism. Additionally, since the amount of parallelism extracted by a compiler is directly influenced by the selectio...

Massively Parallel Processing Approach to Fractal Image Compression

In recent years, fractal image compression techniques (IFS) have gained increasing interest because of their capability to achieve high compression ratios while maintaining very good quality for the reconstructed image. The main drawback of such techniques is the very high computing time needed to determine the compressed code. In this paper, after a brief description of the IFS theory, we disc...

Symmetry Breaking for Multi-criteria Mapping and Scheduling on Multicores

Multiprocessor mapping and scheduling is a long-standing, difficult problem. In this work we propose a new methodology to perform mapping and scheduling along with buffer memory optimization using an SMT solver. We target split-join graphs, a formalism inspired by synchronous data-flow (SDF) which provides a compact symbolic representation of data-parallelism. Unlike the traditional design flow for S...

Exploiting fine-grain thread parallelism on multicore architectures

In this work we present a runtime threading system which provides an efficient substrate for fine-grain parallelism, suitable for deployment in multicore platforms. Its architecture encompasses a number of optimizations that make it particularly effective in managing a large number of threads and with low overheads. The runtime system has been integrated into an OpenMP implementation to allow f...

Journal title:
  • Computer and Information Science

Volume 10  Issue 

Pages  -

Publication date 2017